Timely and Non-Intrusive Active Document Annotation via Adaptive Information Extraction

نویسندگان

  • Fabio Ciravegna
  • Alexiei Dingli
  • Daniela Petrelli
  • Yorick Wilks
چکیده

The process of document annotation for the Semantic Web is complex and time consuming, as it requires a great deal of manual annotation. Information extraction from texts (IE) is a technology used by some of the most recent systems for actively supporting users in the process and reducing the burden of annotation. The integration of IE systems in annotation tools is quite a new development and in our opinion there is still the necessity of thinking the impact of the IE system in the process of annotation. In this paper we discuss two main requirements for active annotation: timeliness and tuning of intrusiveness. Then we present and discuss a model of interaction that addresses the two issues and Melita, an annotation framework that implements such methodology.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

WandaML a markup language for digital document annotation

WandaML is an XML-based markup language for the annotation and filter journaling of digital documents. It addresses in particular the needs of forensic handwriting data examination, by allowing experts to enter information about writer, material (pen, paper), script and content, and to record chains of image filtering and feature extraction operations applied to the data. We present the design ...

متن کامل

User-System Cooperation in Document Annotation Based on Information Extraction

The process of document annotation for the Semantic Web is complex and time consuming, as it requires a great deal of manual annotation. Information extraction from texts (IE) is a technology used by some very recent systems for reducing the burden of annotation. The integration of IE systems in annotation tools is quite a new development and there is still the necessity of thinking the impact ...

متن کامل

Document Analysis And Classification Based On Passing Window

In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...

متن کامل

IXIR: A statistical information distillation system

The task of information distillation is to extract snippets from massive multilingual audio and textual document sources that are relevant for a given templated query. We present an approach that focuses on the sentence extraction phase of the distillation process. It selects document sentences with respect to their relevance to a query via statistical classification with support vector machine...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002